DynaQ: online learning from imbalanced multi-class streams through dynamic sampling

Abstract

Online supervised learning from fast-evolving data streams, particularly in domains such as health, the environment, and manufacturing, is a crucial research area. However, these domains often experience class imbalance, which can skew distributions. It is essential for online algorithms to analyze large datasets in real time while accurately modeling rare or infrequent classes that may appear in bursts. While methods have been proposed to handle binary class imbalance, there is a lack of attention to multi-class imbalanced settings with varying degrees of imbalance in evolving streams. In this paper, we present the Dynamic Queues (DynaQ) algorithm to fill this knowledge gap. Our approach utilizes a batch-based resampling method that creates an instance queue for each class to balance the number of instances. We maintain a queue threshold and remove older samples during training. Additionally, we dynamically oversample minority classes based on one of four rate parameters: recall, F1-score, $\kappa_m$, and Euclidean distance. The algorithm consists of an ensemble that uses sliding windows and a soft voting schema while incorporating a drift detection mechanism. Our experimental results demonstrate the superiority of DynaQ over state-of-the-art methods.
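The per-class queue idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `ClassQueues`, the method names, and the oversampling rule `ceil(target * (2 - recall))` are all assumptions made for illustration, and recall is used here only as one of the four rate parameters the abstract lists. The abstract does not specify how a rate is turned into a sample count.

```python
import math
import random
from collections import defaultdict, deque


class ClassQueues:
    """Hypothetical helper: one bounded FIFO queue of instances per class."""

    def __init__(self, max_len):
        # deque(maxlen=...) drops the oldest instance once the threshold is reached.
        self.queues = defaultdict(lambda: deque(maxlen=max_len))

    def add(self, x, y):
        """Store instance x under its class label y."""
        self.queues[y].append(x)

    def balanced_batch(self, recall_by_class, rng):
        """Draw a training batch, oversampling classes with low recall.

        Assumed rule (not from the paper): each class contributes
        ceil(target * (2 - recall)) instances, sampled with replacement,
        where target is the length of the longest queue.
        """
        if not self.queues:
            return []
        target = max(len(q) for q in self.queues.values())
        batch = []
        for label, queue in self.queues.items():
            # Classes with no recorded recall are treated as recall 0.0.
            recall = recall_by_class.get(label, 0.0)
            n = math.ceil(target * (2.0 - recall))
            batch.extend((x, label) for x in rng.choices(list(queue), k=n))
        rng.shuffle(batch)
        return batch


# Usage: five majority instances, one minority instance, threshold of three.
queues = ClassQueues(max_len=3)
for i in range(5):
    queues.add(i, "majority")
queues.add(99, "minority")
batch = queues.balanced_batch({"majority": 1.0, "minority": 0.5}, random.Random(0))
```

With a threshold of three, the majority queue keeps only its three most recent instances, while the single minority instance is repeated to fill its share of the batch. A bounded `deque` is chosen here because it discards the oldest entry automatically, which matches the abstract's description of removing older samples.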

Similar resources

Online Ensemble Learning for Imbalanced Data Streams

While both cost-sensitive learning and online learning have been studied extensively, efforts to deal with these two issues simultaneously are limited. Aiming at this challenging task, a novel learning framework is proposed in this paper. The key idea is based on the fusion of online ensemble algorithms and state-of-the-art batch-mode cost-sensitive bagging/boosting algorithms. Within th...

A multi-class boosting method for learning from imbalanced data

The acquisition of face images is usually limited due to policy and economic considerations, and hence the number of training examples for each subject varies greatly. The problem of face recognition with imbalanced training data has drawn the attention of researchers, and it is desirable to understand under what circumstances an imbalanced data set affects the learning outcomes, and robust methods are need...

Learning to Classify Data Streams with Imbalanced Class Distributions

Streaming data is pervasive in a multitude of data mining applications. One fundamental problem in the task of mining streaming data is distributional drift over time. Streams may also exhibit high and varying degrees of class imbalance, which can further complicate the task. In scenarios like these, class imbalance is particularly difficult to overcome and has not been as thoroughly studied. I...

Online Imbalanced Learning with Kernels

Imbalanced learning, or learning from imbalanced data, is a challenging problem in both academia and industry. Nowadays, streaming imbalanced data have become common and raise the volume, velocity, and variety issues of learning from these data. To tackle these issues, online learning algorithms are proposed to learn a linear classifier via maximizing the AUC score. However, the developed line...

Online Multi-Task Learning Using Active Sampling

One of the long-standing challenges in Artificial Intelligence for goal-directed behavior is to build a single agent which can solve multiple tasks. Recent progress in multi-task learning for goal-directed sequential tasks has taken the form of distillation-based learning, wherein a single student network learns from multiple task-specific expert networks by mimicking the task-specific policie...

Journal

Journal title: Applied Intelligence

Year: 2023

ISSN: 0924-669X, 1573-7497

DOI: https://doi.org/10.1007/s10489-023-04886-w